# Zero-shot classification

The Teacher V 2

A Transformers model for zero-shot classification that can classify text without a large labeled dataset.

Text Classification

Clip Vitb16 Test Time Registers

A vision-language model based on the OpenCLIP ViT-B-16 architecture. It adds test-time registers to clean up the internal representation and reduce feature-map artifacts.

Industry Project V2

An instruction-tuned model based on the Mistral architecture, suitable for zero-shot classification tasks.

Large Language Model

FG-CLIP is a fine-grained vision-text alignment model that achieves global and region-level image-text alignment through two-stage training, improving fine-grained visual understanding.

Multimodal Alignment

Transformers English

Smartshot Zeroshot Finetuned V0.2.0

A zero-shot classification model fine-tuned from MoritzLaurer/deberta-v3-base-zeroshot-v2.0-c using the SmartShot method within an NLI framework.

Text Classification English
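
Several entries in this list describe zero-shot classification built on an NLI framework. The sketch below illustrates the general idea only, not the SmartShot method or any listed model's code: each candidate label is turned into a hypothesis sentence, an NLI scorer rates how strongly the input text entails it, and a softmax over those scores gives label probabilities. The hypothesis template, the function names, and the word-overlap scorer `toy_entail_logit` are assumptions made for illustration; a real setup would replace the scorer with the entailment logit of an NLI model.

```python
import math
from typing import Callable, Dict, List


def zero_shot_nli(
    text: str,
    labels: List[str],
    entail_logit: Callable[[str, str], float],
    template: str = "This example is about {}.",  # assumed template
) -> Dict[str, float]:
    """Score each label by how strongly `text` entails its hypothesis."""
    # Build one hypothesis sentence per candidate label.
    hypotheses = [template.format(label) for label in labels]
    logits = [entail_logit(text, hypothesis) for hypothesis in hypotheses]
    # Softmax across labels (single-label setting), shifted for stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


def toy_entail_logit(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: counts shared words."""
    punctuation = ".,!?"
    p = {w.strip(punctuation).lower() for w in premise.split()}
    h = {w.strip(punctuation).lower() for w in hypothesis.split()}
    return float(len(p & h))


scores = zero_shot_nli(
    "The team won the football match.",
    ["football", "cooking"],
    toy_entail_logit,
)
best_label = max(scores, key=scores.get)
print(best_label, scores)
```

Because the labels are only strings inserted into a template, new categories can be added at inference time without retraining, which is what makes the approach zero-shot.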

Smartshot Zeroshot Finetuned V0.1.2

A zero-shot classification model fine-tuned from roberta-base-zeroshot-v2.0-c, enhanced with the SmartShot method and synthetic data.

Text Classification Other

An emotion detection model fine-tuned from mDeBERTa-v3, supporting emotion classification in Indonesian and English.

Text Classification

Transformers Supports Multiple Languages

Modernbert Large Nli

A natural language inference model built by multi-task fine-tuning of ModernBERT-large, performing well on zero-shot classification and NLI tasks.

Large Language Model

Transformers Supports Multiple Languages

Modernbert Base Zeroshot V2.0

A zero-shot classifier fine-tuned from ModernBERT-base: fast, memory-efficient, and suitable for a range of text classification tasks.

Text Classification

Modernbert Large Zeroshot V2.0

A zero-shot classifier fine-tuned from ModernBERT-large: fast, memory-efficient, and suitable for a range of text classification tasks.

Large Language Model

Llm Jp Clip Vit Large Patch14

A Japanese CLIP model trained with the OpenCLIP framework on a dataset of 1.45 billion Japanese image-text pairs, supporting zero-shot image classification and image-text retrieval.

Text-to-Image Japanese
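
Zero-shot image classification with a CLIP-style model can be sketched as follows: the image and one text prompt per class are embedded into a shared space, and the class whose prompt embedding is most similar to the image embedding wins. The example below is a minimal illustration with hand-written 3-dimensional vectors; the embeddings, the prompts, and the `logit_scale` value of 100 are assumptions chosen for illustration, and a real model would produce the embeddings with its image and text encoders.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_zero_shot(
    image_emb: Sequence[float],
    text_embs: Dict[str, Sequence[float]],
    logit_scale: float = 100.0,  # assumed temperature, for illustration
) -> Dict[str, float]:
    """Softmax over scaled image-text cosine similarities."""
    prompts = list(text_embs)
    logits = [logit_scale * cosine(image_emb, text_embs[p]) for p in prompts]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {p: e / total for p, e in zip(prompts, exps)}


# Toy embeddings standing in for encoder outputs (hypothetical values).
image_embedding = [0.9, 0.1, 0.0]
prompt_embeddings = {
    "a photo of a cat": [1.0, 0.0, 0.0],
    "a photo of a dog": [0.0, 1.0, 0.0],
}

probs = clip_zero_shot(image_embedding, prompt_embeddings)
prediction = max(probs, key=probs.get)
print(prediction)
```

The same similarity scores can serve image-text retrieval: ranking images by similarity to a text query, or texts by similarity to an image.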

Resnet50x64 Clip Gap.openai

Image encoder of the CLIP RN50x64 model, a ResNet-50 variant scaled to roughly 64x the compute of the base ResNet-50, using Global Average Pooling (GAP).

Image Classification

Resnet50x16 Clip Gap.openai

A ResNet50x16 variant model based on the CLIP framework, focused on image feature extraction

Image Classification

Resnet50x4 Clip Gap.openai

A ResNet50x4 variant model based on the CLIP framework, designed for image feature extraction.

Image Classification

Resnet50 Clip Gap.openai

A ResNet50 variant based on the visual encoder part of the CLIP model, extracting image features through Global Average Pooling (GAP)

Image Classification
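
Global Average Pooling, which several of these encoders use to produce image features, reduces a feature map of shape channels x height x width to one value per channel by averaging over all spatial positions. The sketch below shows the operation on plain nested lists; the function name and the tiny 2-channel feature map are made up for illustration.

```python
from typing import List


def global_average_pool(feature_map: List[List[List[float]]]) -> List[float]:
    """Average each channel of a C x H x W feature map over H and W."""
    pooled = []
    for channel in feature_map:
        # Flatten the spatial grid of this channel, then take the mean.
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two channels, each a 2 x 2 spatial grid (hypothetical values).
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]

embedding = global_average_pool(feature_map)
print(embedding)  # [2.5, 2.0]
```

The output length equals the number of channels regardless of the input resolution, which is why the pooled vector is convenient as a fixed-size image feature.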

Resnet50 Clip Gap.cc12m

CLIP-style image encoder based on ResNet50 architecture, trained on CC12M dataset, extracting features through Global Average Pooling (GAP)

Image Classification

Vit Large Patch14 Clip 224.dfn2b

A vision transformer model based on the CLIP architecture, focused on image feature extraction, released by Apple.

Image Classification

Modernbert Large Zeroshot V1

A natural language inference model fine-tuned from ModernBERT-large, designed specifically for zero-shot classification tasks.

Text Classification

Transformers English

Vit Huge Patch14 Clip 224.laion2b

A ViT-Huge visual encoder based on the CLIP framework, trained on the LAION-2B dataset, supporting image feature extraction.

Image Classification

Vit Base Patch32 Clip 224.laion2b

A Vision Transformer model based on the CLIP architecture, trained on the LAION-2B dataset and designed for image feature extraction.

Image Classification

Convnext Base.clip Laiona

ConvNeXt Base model based on the CLIP framework, trained on the LAION-Aesthetic dataset, suitable for image feature extraction tasks.

Image Classification

Modernbert Base Nli

A ModernBERT-base model fine-tuned on multi-task natural language inference (NLI) data, performing well on zero-shot classification and long-context reasoning.

Large Language Model

Transformers Supports Multiple Languages

Llm Jp Clip Vit Base Patch16

A Japanese CLIP model trained with the OpenCLIP framework, supporting zero-shot image classification tasks.

Text-to-Image Japanese

Deberta Zero Shot Classification

A zero-shot text classification model fine-tuned from DeBERTa-v3-base, suitable for rapid prototyping or scenarios where labeled data is scarce.

Text Classification

Transformers English

LLM2CLIP Openai L 14 224

LLM2CLIP uses large language models (LLMs) to strengthen CLIP. A contrastive learning framework improves the discriminability of text representations, addressing limitations of the original CLIP text encoder.

LLM2CLIP Openai B 16

LLM2CLIP uses large language models (LLMs) to extend CLIP's capabilities. A contrastive learning framework improves the discriminability of text representations and, with it, cross-modal task performance.

Bart Large Mnli Openvino

The OpenVINO-optimized version of the facebook/bart-large-mnli model for zero-shot text classification tasks.

Text Classification

Vit Base Patch16 Clip 224.metaclip 2pt5b

A dual-framework compatible vision model trained on the MetaCLIP-2.5B dataset, supporting both OpenCLIP and timm frameworks

Image Classification

Resnet50 Clip.yfcc15m

ResNet50 model trained on the YFCC-15M dataset, compatible with both open_clip and timm frameworks, supporting zero-shot image classification tasks.

Image Classification

Deberta Small Long Nli

A compact zero-shot classification model based on the DeBERTa architecture, optimized for natural language inference on long texts and converted to ONNX format for web compatibility.

Text Classification

EXLMR is an extended version of XLM-R that supports new languages by expanding the tokenizer vocabulary to mitigate out-of-vocabulary issues, specifically optimized for low-resource Ethiopian languages.

Large Language Model

Transformers Other

Marqo Fashionsiglip

Marqo-FashionSigLIP is a multimodal embedding model optimized for fashion product search, with a reported 57% improvement in MRR and recall compared to FashionCLIP.

Transformers English
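
MRR (mean reciprocal rank), one of the retrieval metrics quoted for search-oriented embedding models, averages 1/rank of the first relevant result over a set of queries, counting a query as 0 when the relevant item is not retrieved. The sketch below computes it for a handful of toy queries; the product IDs and rankings are invented for illustration and are not taken from any benchmark.

```python
from typing import List


def mean_reciprocal_rank(rankings: List[List[str]], targets: List[str]) -> float:
    """Average of 1/rank of the target item per query (0 if absent)."""
    total = 0.0
    for results, target in zip(rankings, targets):
        for rank, item in enumerate(results, start=1):
            if item == target:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(rankings)


# Four hypothetical queries, each looking for product "p1".
rankings = [
    ["p1", "p2", "p3", "p4"],  # hit at rank 1 -> 1.0
    ["p2", "p1", "p3", "p4"],  # hit at rank 2 -> 0.5
    ["p4", "p3", "p2", "p1"],  # hit at rank 4 -> 0.25
    ["p2", "p3", "p4", "p5"],  # no hit        -> 0.0
]
targets = ["p1", "p1", "p1", "p1"]

mrr = mean_reciprocal_rank(rankings, targets)
print(mrr)  # 0.4375
```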

Deberta V3 Nli Onnx Quantized

A quantized ONNX model based on DeBERTa-v3-base, suitable for zero-shot text classification tasks.

Text Classification

Transformers English

Safeclip Vit H 14

Safe-CLIP is an enhanced vision-language model designed to mitigate risks associated with Not Safe For Work (NSFW) content in AI applications.

Gliclass Large V1.0

An efficient zero-shot classifier trained on synthetic data, suitable for topic classification, sentiment analysis, and reranking tasks in RAG workflows.

Text Classification

Transformers English

Gliclass Base V1.0

GLiClass is an efficient zero-shot classifier inspired by GLiNER, suitable for text classification, sentiment analysis, and reranking tasks in RAG workflows.

Text Classification

Transformers English

Gliclass Small V1.0

An efficient zero-shot classifier trained on synthetic data, suitable for topic classification, sentiment analysis, and reranking tasks in RAG workflows

Text Classification

Transformers English

Gliclass Base V1.0 Lw

GLiClass is an efficient zero-shot classifier trained on synthetic data, suitable for text classification, sentiment analysis, and reranking tasks in RAG workflows.

Text Classification

Transformers English

A zero-shot classification model based on the Transformers library, capable of performing classification tasks without task-specific training data.

Text Classification

Deberta Base Long Nli

A DeBERTa-v3-base model with its context length extended to 1280 tokens, fine-tuned for 250,000 steps on the tasksource dataset, focusing on natural language inference and zero-shot classification tasks.

Large Language Model

© 2025 AIbase